
Sequence-Level Certainty Reduces Hallucination In Knowledge-Grounded Dialogue Generation

Wan, Yixin, Wu, Fanyou, Xu, Weijie, Sengamedu, Srinivasan H.

arXiv.org Artificial Intelligence

Model hallucination has been a crucial research interest in Natural Language Generation (NLG). In this work, we propose sequence-level certainty as a common theme over hallucination in NLG, and explore the correlation between sequence-level certainty and the level of hallucination in model responses. We categorize sequence-level certainty into two aspects: probabilistic certainty and semantic certainty, and reveal through experiments on the Knowledge-Grounded Dialogue Generation (KGDG) task that both higher probabilistic certainty and higher semantic certainty in model responses are significantly correlated with a lower level of hallucination. Moreover, we provide theoretical proof and analysis showing that semantic certainty is a good estimator of probabilistic certainty, and therefore has potential as an alternative to probability-based certainty estimation in black-box scenarios. Based on the observed relationship between certainty and hallucination, we further propose Certainty-based Response Ranking (CRR), a decoding-time method for mitigating hallucination in NLG. Following our categorization of sequence-level certainty, we propose two types of CRR: Probabilistic CRR (P-CRR) and Semantic CRR (S-CRR). P-CRR ranks individually sampled model responses by the arithmetic mean log-probability of the entire sequence. S-CRR approaches certainty estimation from the meaning space, ranking model response candidates by their semantic certainty level, which is estimated with an entailment-based Agreement Score (AS). Through extensive experiments across three KGDG datasets, three decoding methods, and four different models, we validate the effectiveness of the two proposed CRR methods in reducing model hallucination.
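The P-CRR idea described above can be sketched in a few lines: rank sampled responses by the arithmetic mean of their token log-probabilities and prefer the most certain one. This is a minimal illustration, not the paper's implementation; the candidate responses and their token log-probabilities are invented stand-ins for values a real decoder would report.

```python
# Minimal sketch of Probabilistic CRR (P-CRR): rank sampled responses by
# the arithmetic mean of their token log-probabilities. Higher mean
# log-probability = higher sequence-level probabilistic certainty.

def p_crr_rank(candidates):
    """candidates: list of (response_text, [token_log_probs])."""
    scored = [
        (sum(logps) / len(logps), text)
        for text, logps in candidates
        if logps  # guard against empty sequences
    ]
    # Sort so the most certain response comes first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored]

# Hypothetical sampled responses with made-up token log-probabilities.
samples = [
    ("The Eiffel Tower is in Paris.", [-0.1, -0.2, -0.1, -0.3]),
    ("The Eiffel Tower is in Rome.",  [-0.5, -1.2, -2.0, -1.8]),
]
ranked = p_crr_rank(samples)
```

S-CRR would replace the mean log-probability with an entailment-based Agreement Score computed in meaning space, which makes it applicable when token probabilities are unavailable (the black-box scenario).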


KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection

Choi, Sehyun, Fang, Tianqing, Wang, Zhaowei, Song, Yangqiu

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have demonstrated remarkable human-level natural language generation capabilities. However, their potential to generate misinformation, often called the hallucination problem, poses a significant risk to their deployment. A common approach to address this issue is to retrieve relevant knowledge and fine-tune the LLM with the knowledge in its input. Unfortunately, this method incurs high training costs and may cause catastrophic forgetting for multi-tasking models. To overcome these limitations, we propose a knowledge-constrained decoding method called KCTS (Knowledge-Constrained Tree Search), which guides a frozen LM to generate text aligned with the reference knowledge at each decoding step using a knowledge classifier score and MCTS (Monte-Carlo Tree Search). To adapt the sequence-level knowledge classifier to token-level guidance, we also propose a novel token-level hallucination detection method called RIPA (Reward Inflection Point Approximation). Our empirical results on knowledge-grounded dialogue and abstractive summarization demonstrate the strength of KCTS as a plug-and-play, model-agnostic decoding method that can effectively reduce hallucinations in natural language generation.


'AI Jesus' talks dating, relationships, morals -- even offers video-gaming tips

FOX News

AI technology is quickly creeping into every industry, prompting new questions about whether online content comes from a human or a computer. A chatbot "version" of Jesus Christ called "Ask_Jesus" is streaming on the gaming platform Twitch -- and it stands ready to answer questions from humans on anything from morality issues to the video game Fortnite to super-powered rodents. Shown with wavy, brown hair and a beatific expression, accompanied by a calm, well-modulated voice, "AI Jesus" calls users on the platform by name -- and appears to consider each question asked with care, as YouTube videos of livestreams reveal. "I am AI Jesus, here to share wisdom based on Jesus' teachings, and help answer questions related to spirituality, personal growth and other wholesome topics," AI Jesus can be heard saying in a video recording of a recent livestream posted to YouTube by Fara Jakari.


Search-Engine-augmented Dialogue Response Generation with Cheaply Supervised Query Production

Wang, Ante, Song, Linfeng, Liu, Qi, Mi, Haitao, Wang, Longyue, Tu, Zhaopeng, Su, Jinsong, Yu, Dong

arXiv.org Artificial Intelligence

Knowledge-aided dialogue response generation aims at augmenting chatbots with relevant external knowledge in the hope of generating more informative responses. The majority of previous work assumes that the relevant knowledge is given as input or retrieved from a static pool of knowledge. However, this assumption conflicts with real-world settings, where knowledge is continually updated and a chatbot has to dynamically retrieve useful knowledge. We propose a dialogue model that can access the vast and dynamic information from any search engine for response generation. As the core module, a query producer is used to generate queries from a dialogue context to interact with a search engine. We design a training algorithm using cheap noisy supervision for the query producer, where the signals are obtained by comparing retrieved articles with the next dialogue response. As a result, the query producer is trained without any human annotation of gold queries, making it easily transferable to other domains and search engines. Experiments show that our query producer can achieve R@1 and R@5 rates of 62.4% and 74.8% for retrieving gold knowledge, and that the overall model generates better responses than strong knowledge-aided baselines using BART and other typical systems.
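The cheap-supervision idea above can be sketched concretely: candidate queries are scored by how well the articles they retrieve match the next gold dialogue response, so no human-annotated gold queries are needed. The retrieval function and the word-overlap metric below are illustrative stand-ins for a real search engine and the paper's comparison signal.

```python
# Sketch of noisy supervision for a query producer: the candidate query
# whose retrieved article best overlaps the next gold response becomes
# the (noisy) training target -- no gold-query annotation required.

def overlap(article, response):
    """Fraction of response tokens also found in the retrieved article."""
    art = set(article.lower().split())
    resp = set(response.lower().split())
    return len(art & resp) / max(len(resp), 1)

def best_query(candidate_queries, retrieve, gold_response):
    # Pick the query whose retrieval most resembles the gold response.
    return max(
        candidate_queries,
        key=lambda q: overlap(retrieve(q), gold_response),
    )

# Toy search engine: a fixed query -> article lookup (invented data).
articles = {
    "eiffel tower height": "The Eiffel Tower is 330 metres tall.",
    "paris weather": "Paris will be sunny this weekend.",
}
query = best_query(
    list(articles),
    lambda q: articles[q],
    "It is 330 metres tall!",
)
```

A real system would use a learned comparison signal rather than raw lexical overlap, but the supervision structure is the same: retrieval quality against the next response substitutes for annotated queries.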


Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases

Weikum, Gerhard, Dong, Luna, Razniewski, Simon, Suchanek, Fabian

arXiv.org Artificial Intelligence

Equipping machines with comprehensive knowledge of the world's entities and their relationships has been a long-standing goal of AI. Over the last decade, large-scale knowledge bases, also known as knowledge graphs, have been automatically constructed from web contents and text sources, and have become a key asset for search engines. This machine knowledge can be harnessed to semantically interpret textual phrases in news, social media and web tables, and contributes to question answering, natural language processing and data analytics. This article surveys fundamental concepts and practical methods for creating and curating large knowledge bases. It covers models and methods for discovering and canonicalizing entities and their semantic types and organizing them into clean taxonomies. On top of this, the article discusses the automatic extraction of entity-centric properties. To support the long-term life-cycle and the quality assurance of machine knowledge, the article presents methods for constructing open schemas and for knowledge curation. Case studies on academic projects and industrial knowledge graphs complement the survey of concepts and methods.
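Two of the building blocks the survey covers — canonicalizing surface names to entity identifiers and storing entity-centric properties as subject-predicate-object triples — can be shown with a toy example. The aliases and facts below are invented for illustration and do not come from any real knowledge base.

```python
# Toy illustration of entity canonicalization plus triple storage,
# the basic shape of a knowledge graph described in the survey.

# Hypothetical alias table mapping surface mentions to a canonical id.
ALIASES = {
    "nyc": "New_York_City",
    "new york": "New_York_City",
    "the big apple": "New_York_City",
}

def canonicalize(mention):
    # Map a textual mention to its canonical entity id (identity fallback).
    return ALIASES.get(mention.strip().lower(), mention.strip())

# Store entity-centric properties as (subject, predicate, object) triples;
# canonicalization merges different mentions of the same entity.
triples = set()
for mention, predicate, obj in [
    ("NYC", "type", "City"),
    ("The Big Apple", "locatedIn", "United_States"),
]:
    triples.add((canonicalize(mention), predicate, obj))
```

Real systems face the hard parts the survey discusses — discovering aliases automatically, resolving ambiguous mentions, and keeping the taxonomy of types clean — but the triple representation itself stays this simple.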


K & K Technical (@KK_Technical)

#artificialintelligence

How will self-driving cars disrupt the auto industry?

Get the inside track and land your new job with these career-seeking strategies http://goo.gl/rO8Qch

We hope the tragedy in Tianjin reminds all of us in manufacturing to carefully adhere to all safety procedures http://goo.gl/MEbQr6

More and more engineers are emerging as successful business leaders across the U.S. http://goo.gl/wT8smy